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Abstract 

Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from 
experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this 
reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, 
observational data of the organism in steady state (e.g., wild-type) are more readily available, but their informational 
content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources 
for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying 
directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first 
step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining 
an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. 
In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based 
method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational 
experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms 
competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a 
consistent estimate of the regulatory network. 
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Introduction 

The ability to reconstruct cellular networks plays an important 
role in our understanding of how genes interact with each other 
and the way information flows through them to regulate their 
expression levels. Such reconstructions heavily depend on the 
input data employed. The availability of data on the response of 
the cell to perturbations -either by knocking out or sUencing genes- 
offers the possibility for improved network reconstructions and 
constitutes a key input in functional genomics. As pointed out in 
[1], high-dimensional phenotypic profiles obtained from pertur- 
bation experiments in the form of expression data offer the 
potential for obtaining a comprehensive view of cellular functions, 
even though they exhibit a number of limitations as oudined in 
[2] . A key problem is the fact that perturbation experiments only 
provide indirect information on gene interactions, as explained 
below [3]. Further, inferring large scale cellular networks from 
perturbation data is computationally challenging, and only a 
limited number of computational tools have been developed to 
address it. Some approaches are built on clustering of phenotypic 
profiles [4,5], which are based on the reasoning that functionally 
related genes should exhibit similar behavior under perturbations 



and hence cluster together. A tailor-made approach for the 
problem of estimating networks from perturbation data is the 
nested effects models (NEMs) [1,2,6]. NEMs are a special class of 
graphical models originally introduced to uncover the hierarchies 
among transcription factors based on observations of affected 
genes. More recently, NEMs have been extended to reconstruct 
regulatory networks by taking advantage of the nested structure of 
the observed perturbation effects, where for computational 
efficiency purposes, triplets of genes are used to assemble the 
global regulatory network. Extensions of this method that capture 
temporal effects by using perturbation time series measurements 
are described in [7,8]. In response to the reconstruction problems 
presented in the DREAM challenges (see more information in the 
Results section), methods like feed-forward loop down ranking 
(FFLDR) [9] and a Z-test based method coupled with ordinary 
differential equations to model temporal changes in expression 
data (Inferelator) [10] were also developed. 

The computational difficulty of reconstructing a network from 
data, alluded to above, stems from the fact that in order to capture 
the regulatory interactions one should consider all possible 
orderings of genes in the network (so that parent nodes influence 
child ones) and score the resulting network structures accordingly. 
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II. Use penalized likelihood to estimate 
a DAG for each ordering 




III. Determine the "consensus" graph 
by model averaging 




Figure 1 . Overview and details of tlie RIPE metKiod. (A) Overview of the RIPE algorithm. (B) First step to obtain a large set of causal orderings 
from a network with cycles. The network graph is decomposed into strongly connected components, so called super-nodes, (left) followed by a DPS 
on the strongly connected components (right). For example, the post-visit time of super-node A is 8, and thus A precedes all other nodes. The 
topological ordering of super-nodes is A-<B-<C<D. (C) Illustration of MC-DFS algorithm. Gene perturbation graph (left), DFS visit times for labeling 
#1 (middle), and DFS visit times for labeling #2 (right). (D) Depiction of a small network to illustrate the influence graph and the predictors used in 
the penalized likelihood estimation procedure. The true regulatory network (left), the influence graph under no noise (middle), observed influence 
graph, with false positive and false negative edges (right). Edges in the regulatory networks are shown in thick lines and additional edges in the 
influence graph are distinguished with narrow lines; red dash lines indicate false positives. 
doi:1 0.1 371 /journal.pone.0082393.g001 
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The computational complexity of identifying aU possible orderings 
of a set of nodes in a directed graph is exponential in the size of the 
graph. Hence, the approach based on nested effects models 
employs several heuristics for searching the space of orderings. 
Similarly, [1 1] employs a Markov Chain Monte Carlo based 
search method and subsequent scoring of the resulting network 
structures. 

Another set of approaches solely utilizes observational gene 
expression data that capture the system in steady state. A major 
technical tool for such reconstructions is graphical models [12] 
that encode a probability model over the genes through the 
underlying network. Over the last few years, a number of 
algorithms have been proposed in the literature for the estimation 
(reconstruction) of primarily Gaussian graphical models under the 
assumption of a sparse underlying structure (see [13,14] for a 
discussion and references therein). The main shortcoming of these 
approaches is that graphical models for obserx^ational data are 
mostiy capable of identifying dependencies between genes, rather 
than causal relations representing regulatory mechanisms. Further, 
the presence of more genes than available samples usually leads to 
very sparse reconstructions. It should be noted that recent work is 
geared towards identifying causal effects from observational data 
by employing the concept of intervention calculus [15]. Also 
utilizing only steady state gene expression data is a method 
implemented in the PC-algorithm [16] that starts from a complete 
undirected graph and recursively sparsifies it based on conditional 
independence decisions; directionality can only be inferred for a 
subset of edges (due to the issue of observational equivalence [12]) 
and is added as a post-processing step. 

Time course gene expression data have also been used to 
estimate gene regulatory networks, using two classes of models. In 
the first approach, called Granger causality, the predictive effect of 
genes on each other are used to estimate regulatory relationships 
among genes [17,18]. In the second approach, known as dynamic 
Bayesian networks (DBNs), the framework of Bayesian networks is 
extended to incorporate biological networks with feedback loops 
[14,19,20]. More recentiy, penalization methods have been 
applied to improve the estimation of high-dimensional regulatory 
networks from small sample sizes [21-24]. 

The proposed computational approach in this study utilizes 
both gene expression data obtained from perturbation experi- 
ments, and an independent expression data set reflecting steady 
state behavior of the cell. The main steps are summarized in 
Figure lA. Specifically, based on the perturbation phenotyping 
data, we obtain a (large) set of causal orderings of the genes 
through a fast search algorithm which samples from the space of 
all possible orderings (see section Obtaining Causal Orderings 
from Perturbation Data for the definition of a causal ordering and 
algorithmic details). These orderings correspond to the inherent 
layering of nodes of the graph. Each set of orderings is then 
employed to obtain a directed acyclic regulatory graph/network 
(DAG) using the independent gene expression data set through an 
extension of a fast penalized likelihood method introduced in [13]. 
Further, the likelihood of every estimated graph is Ccdculated. 
Finally, a consensus regulatory network (which can very well contain 
cycles when the true network is cyclic) is obtained by averaging a 
small set of the most likely DAGs obtained. The advantage of the 
proposed algorithm is that it utihzes both perturbation and steady 
state expression data, and once the set of gene orderings is 
determined, a theoretically rigorous and computationally fast 
likelihood-based method for estimating the underlying network 
structure is used. Further, causal orderings are determined by 
searching through the entire set of orderings consistent with the 
perturbation data, thus taking a global perspective, as opposed to 



competing methods (e.g. the nested effects model or the PC- 
algorithm) that only utilize information pertaining to direct 
neighbors of nodes and then assembling the network from such 
local estimates. 

The proposed methodology has a number of additional 
advantages over competing methods. First, the underlying 
assumption in the estimation of regulatory networks from 
perturbation data is that the structure of the network does not 
change in different knockout experiments, and therefore observa- 
tions from these experiments can be combined in order to estimate 
the underlying regulatory network. However, such an assumption 
is not fully valid: biological systems and regulatory networks are 
highly robust [25,26], which is believed to be the result of 
redundant regulatory mechanisms, and knocking out a single gene 
may trigger an alternate regulatory pathway, resulting in a 
different network structure. On the other hand, the main 
assumption of the methodology proposed in this paper is that 
the causal orderings of genes remain stable in different perturba- 
tion experiments, which is significantiy less restrictive. Second, 
although the problem of reconstructing regulatory networks is 
computationally NP-hard, the computational complexity of the 
approximate algorithm proposed in this paper is considerably 
lower than most methods of network reconstructions directly from 
perturbation data. This is mainly due to the following facts: (i) the 
space of possible orderings is smaller than the space of graphs 
(even acyclic ones) [11], and (ii) by employing the Monte Carlo 
sampling framework over the space of orderings, the approximate 
algorithm offers a tractable alternative in high dimensional 
settings. Our extensive experimental results indicate that the 
algorithm does not require an exhaustive search over the space of 
orderings, and a much smaller set of orderings often results in 
significant improvements over completing methods. Finally, it is 
known that perturbation data in the form of single gene knockouts 
do not provide sufficient information for estimation of regulatory 
iict\vorks and, in theory, all possible knockout combinations may 
be needed to fully discover a regulatory network. On the other 
hand, the indirect information in perturbation experiments is 
sufficient for estimating the causal orderings (see the subsection 
Obtaining Causal Orderings from Perturbation Data under 
Methods). Therefore, by breaking down the nct^vork reconstruc- 
tion problem into three steps, not only can we achieve better 
computational complexity, but we can also transform a non- 
tractable problem into a sequence of tractable sub-problems. 
Further, the proposed approach offers the possibility to gain 
insight into the informational contributions of the two data sources 
used, and offers improved performance through systematic 
integration of the two sources of data. 

Methods 

Method Overview 

We introduce next a three-step algorithm, called RIPE (standing 
for Regulatory network Inference from joint Perturbation and 
Expression data), which incorporates both perturbation screens 
from knockout/knockdown experiments, as well as gene expres- 
sion data usually reflecting steady state behavior of the cell, in 
order to address the problem of estimation of regulatory networks. 
We adopt the term steadji-state data in our presentation for wild- 
type data. The data obtained from perturbation (knock-out/ down) 
experiments are referred to as perturbation data. Note though that in 
practice, the actual "lab measurements" of the knockout effects are 
also considered in an equilibrium state. However, for presentation 
clarity, we differentiate the two above-mentioned sources of data. 
The main steps for RIPE are illustrated in the flowchart of 
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Figure lA. In the first step, the data fi'om the perturbation screens 
are employed to obtain a large collection of causal orderings. In 
the second step, each causal ordering is used in conjunction with 
the steady state data to obtain an estimated regulatory network 
through a penalized likelihood approach. Finally, in the third step 
a consensus network is constructed from the best networks 
obtained in the second step. A detailed description of these three 
steps follows. 

Obtaining Causal Orderings from Perturbation Data 

Estimating the Influence Matrix from Perturbation 

Data. Let V denote the binary influence matrix of size ky.p, with 
p representing the number of genes under study and k the number 
of single genes that are knocked out/ down. In general, the RIPE 
algorithm requires "perturbation" of individual genes. These 
perturbations can be in the form of knockouts and/ or knockdowns 
as illustrated in the the section DREAM4 Challenge under 
Results. Thus, all references to knockout experiments throughout 
the text also include single gene knockdowns. Each knockout 
experiment is repeated times, and captures the effect of the 
knockout gene on the remaining genes. Also required are 
observations on the unperturbed network (repeated «o times), to 
determine the baseline expression of the p genes. 

An entry in the influence matrix Viij) = I if the knockout 
experiment for gene / is affecting gene j, and 0 otherwise. To assess 
whether an entry is non-zero, meaning that gene j is differentially 
expressed in the knockout experiment of gene i compared to its 
baseline expression, we can use e.g. a (moderated) 2-sample ?-test. 
Applying different cutoffs to the p-values obtained from such an 
analysis results in different number of non-zero entries in V. In 
Section Determining the Influence Matrix under Results, we 
describe a systematic procedure for determining the appropriate p- 
value cutoff It is therefore clear that as the quality of the 
perturbation experiments improves, and/or number of replicates 
for each perturbation experiments increases, fewer false positive/ 
negative edges are present in the V. Well-conducted perturbation 
experiments with «, in the range of 2 to 5 replicates often provide 
the required level of accuracy. 

From matrix V one can obtain the directed influence graph 
Gp( V,E) of P, where V is the set of vertices and E the set of the 
graph edges. The graph Gp contains an edge from node / to node j 
if the corresponding entry V(iJ) = 1 . 

It is worth noting that the RIPE algorithm can also be used in 
the case where k<p, which is often of interest in the setting of 
regulatory networks. In this setting k is the number of transcription 
factors (TFs), and p the total number of genes under study. 
Assuming that there are no directed edges from target genes to 
TFs, causal orderings are then found only from TFs to other TFs 
as well as target genes. To simplify the exposition of statistical 
models, we discuss the details of the algorithm in the setting where 
k=p, and defer the case of k<p to Section Regulatory Network 
in Yeast, where we illustrate the application of RIPE in such a 
setting by estimating the gene regulatory network of yeast with 
A: = 269 perturbations on transcription factors, and p = 6051 total 
genes. 

Obtaining Causal Orderings based on the Influence 
Matrix. A critical step in the proposed approach is to obtain 
causal orderings of the genes from the influence matrix V. In case 
the influence matrix V describes "acyclic" causal effects (i.e, the 
influence graph of V is acyclic), a single ordering of the underlying 
graph would suffice. However, most likely the perturbation matrix 
contains cyclic causal effects, due to the presence of feedback 
mechanisms in the regulatory network. Further, perturbation 
experiments usually yield noisy data, which could also result in 



cycles in the influence matrix, even if the underlying network is 
acyclic. Hence, one usually deals with an influence matrix whose 
underlying graph contains cycles and an individual causal ordering 
is not sufficient. We discuss next how one can obtain a set of causal 
orderings from the influence graph G-p. 

We start by providing key definitions for a linear ordering of a set, 
topological ordering of an acyclic directed graph, and causal ordering of 
a directed graph. A linear ordering of the elements of a set is a 
collection of ordered pairs of the set elements, such that the ordering 
of each pair satisfies a certain criterion. In graph theory, an 
example of a linear ordering is a topological ordering. A topological 
ordering (known also as topological sort [27]) of a directed acyclic 
graph Gj> comprising of p nodes, corresponds to a linear list of the 
form {xi,X2, ■ ■ . ,Xp) of the nodes {I, . . . ,p}, with X; denoting the 
label of the node in the /-th position in the ordered list. The 
ordering adheres to all partial relations i^j implied by the graph 
G-p, where the relation i^j is interpreted as "node / precedes node 
/', i.e. there is an acyclic path from node / to node j. We define the 
causal ordering of a directed graph to be the linear listing of nodes 
that corresponds to a valid depth-first search traversal of the graph, 
defined next. Before proceeding to the description of our 
algorithms, we want to emphasize a distinction between the terms 
topological sort and causal ordering. In acyclic graphs, a topological 
sorting is a special case of a causal ordering. In other words, the 
former refers to graphs with no cycles; the latter refers to linear 
orderings of causal effects induced by the influence graph, and are 
obtained from graphs that potentially have cycles. The difference 
is that in the latter ordering, not all partial relations of the form 
i<j hold. 

A standard method in graph theory for obtaining a topological 
ordering of an acyclic graph is by employing the depth-first search 
(DFS) algorithm [27,28] which is oudined in Figure 2. When the 
exact same algorithm is utilized in graphs with cycles, a causal 
ordering is obtained. DFS is a graph traversal algorithm. It 
"searches" the graph by traversing it "in-depth". This means that 
when a node is discovered, the algorithm wiU continue searching 
for undiscovered nodes adjacent to the current one. When the 
algorithm reaches a "dead-end", it backtracks until it finds 
previously visited nodes that have undiscovered neighboring 
nodes, and if this fails as well, it proceeds to new nodes not yet 
visited. The complete details of the algorithm are shown in 
Figure 2, where a recursive implementation of the method is given 
(similar to the one in [27]). Note that the algorithm saves the time 
of the first discovery of the node (pre-visit time) and the time of 
final departure of the node (post-visit time). Final departure for, 
say, node / occurs when there are no paths starting from ; and 
leading to undiscovered nodes. The ordering is readily acquired by 
a descending sort of the post-visit times, as shown in the right 
panel of Figure IB. 

For cyclic graphs, a mere application of DFS does not produce 
sufficient information to help us reconstruct the regulatory 
network. We would like to obtain as many causal orderings as 
possible, and evaluate each individual causal ordering using our 
penalized HkelUiood method described in the next section. To 
tackle the problem of obtaining a large set of causal orderings in 
graphs with cycles, we need to introduce the following two steps. 

• In step 1, we decompose the graph into its connected 
components. Specifically, a strongly connected component [27] is a 
subgraph, such that there exists a path from every node to any 
other node in the subgraph. Hence, if we collapse each 
strongly connected component into a single super-node, the 
resulting graph is a DAG (see Figure IB, left panel). We then 
produce a topological sorting of the super-nodes. Note that since 
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Algorithm for DFS 

Input: Graph G(V, B) 

Output: Arrays pre, post of pre-visit and post-visit times 
clocks 1 (Graph traversal counter} 
for all D e y do 

visited(t;) = false 
end for 

for all J) G y do 

if not visited(i;) then 
explore(?;) 

end if 
end for 

Procedure explore(?;) {Find all nodes reachable from v} 
visited(ii) = true 

pre [v] = clock {Mark the time of first discovery of u} 

clock = clock + 1 

for each edge (u, u) a E do 

if not visited(ii) then 
explore(M) 

end if 
end for 

post [v] = clock {Mark the time of final departure from v} 
clock = clock + 1 



Figure 2. Algorithm for DFS. Detailed steps of the algorithm for the DFS search. 
doi:1 0.1 371 /journal.pone.0082393.g002 



the super-nodes form a DAG, a mere topological sort is 
sufficient (see Figure IB, right panel). 
• In step 2, we produce a set of causal orderings for each super- 
node. Recall that in the presence of cycles, a causal ordering is 
not a topological ordering; instead, it is a linear listing of nodes 
that arises from a graph traversal using DFS. We propose two 
methods to obtain a set of causal orderings, each described in 
the sequel. The first involves an exhaustive DFS search and 
can be employed on strong components with relatively few 
nodes; the other couples DFS with Monte Carlo sampling and 
is suitable for large size strong components. 

After completion of the above steps we combine the set of 
orderings corresponding to each super-node and thus we obtain 
the "universe" of causal orderings imposed by the influence matrix 
V. For the network graph depicted in Figure IB, the universe of 
orderings includes l^{5^6}-<{2^3^4}^7, l-<{5<6}^ 
{4<:2-<3}^7, l^{5-<6}^{3^4^2}^7, l<{6-<5}-<{2-<3 
^4}^7, l^{6<5}<{4^2^3}-<7, l-<{6<5}^{3-<4-<2} 
<1. 

In the exhaustive DFS approach, we modify and extend 
Knuth's backtracking algorithm [29] that was proposed for 
generating all topological sorting arrangements of a DAG. The 
exhaustive search procedure is initiated at every node of the 
strongly connected component (thus, a parallel/concurrent 
implementation of this is feasible). This ensures that all possible 
causal orderings wiU be considered. Now, suppose that our search 
method has just discovered a new node, say node j. A key idea that 
ensures that all paths initiated from that node will be accounted, is 
to save all adjacent nodes of the newly visited node in a circular 
list. Suppose the Kst of node j contains nodes {i\,i2,---,ik}- Then, 
DFS proceeds exactly as discussed above, i.e., it traverses the 
graph "in-depth", starting from i\. After all paths originating from 
node i\ are visited, the algorithms backtracks to j, and then the 
alternative paths will be explored by consulting our circular list 
(i.e., paths starting at i2, etc). However, exhaustively searching via 



backtracking has exponential complexity due to the huge number 
of combinations of paths that DFS can take. This makes the 
method practically infeasible for relatively large strongly connected 
components (e.g. larger than ~ 10). Nevertheless, it represents a 
useful tool for obtaining the universe of orderings in small size 
components. 

For large size strong components, we develop a fast approximation 
algorithm, named MC-DFS, that incorporates ideas from Monte 
Carlo sampling techniques. MC-DFS consists of two simple steps: 
first, it employs a random labeling of the graph nodes; then, it runs 
DFS based on the current labeling. The workings of the MC-DFS 
heuristic algorithm can be best demonstrated - for in = 2 label 
permutations - with the example of Figure IC. Suppose that the 
influence graph given by matrix V is the one depicted on the 
leftmost panel of the figure. The middle panel depicts labeling I 
and the rightmost panel shows labeling 2. The DFS post- and pre- 
visit times for both cases are as shown in the figure. Given labeling 
1, DFS produces the following causal ordering: G2<!G1 
^G4^G8^G3<G6<G5^G7. Given labeling 2, DFS produc- 
es: G4^G8-<G3^G6^G2^G1^G5^G7. Thus, a large set of 
random permutations of the node labels, followed by an 
application of the DFS algorithm allows us to obtain a significant 
number of causal orderings and efficiently sample the space of all 
possible orderings. Obviously, the quality of the sampled space of 
orderings depends on the number of label permutations used (for 
some empirical assessment see the Results and Discussion 
sections). The complexity of DFS is 0{\ V\ + \E\) for a graph with 
\V\ nodes and \E\ edges. Thus, the total complexity of MC-DFS 
for a strong component of r nodes and e edges is 0{mr + me), 
should one decide to generate m permutations. 

Estimation of Network Structure Using Gene Expression 
Data 

As mentioned above, a gene regulatory network can be 
represented by a directed graph, whose adjacency matrix is 
denoted by A. The element Aj j = \ if gene i is direedy regulated 
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by gene j, and Aj^i = 0 otherwise. In the setting of graphical models 
[12], the nodes of the graph represent random variables 
X\,. . . ,Xp and the edges capture associations between them. 

It was shown in [13], that if the underlying network is a DAG 
and an ordering of its nodes is known, then estimating the network 
reduces to estimating its skeleton, a significantly simpler compu- 
tational problem. Specifically, the causal effects of random 
variables in a DAG can be explained using structural equation models 
(SEM) [12]. In the setting \vlK'rc' the data arc' normally distributed, 
SEM's can be represented based on linear functions explaining the 
relationship between each node and the set of its parents in the 
network: 



Xi = ^ QjiXj + Zi, i=\,. 



(1) 



Here, pa, denotes the set q{ parents of node /, and Zi% axe latent 
variables representing the variation in each node unexplained by 
its parents (for normally distributed data, Z, ~A/'(0,o^)). Finally, 
the coefficients Oji for the linear regression model represent the 
effut of gene / on i for y'epa,-. 

In the case of cyclic graphs, the above SEM representation for 
DAGs is not directly applicable. However, for each causal 
ordering from the previous section, the nodes of the graph can 
be reordered to obtain a DAG. Hence, using the above 
representation, the problem of estimating the structure of the 
DAGs corresponding to a causal ordering of nodes, say o, can be 
posed as a penalized likelihood estimation prohl(;m as shown in 
[13]. In particular, let X be the nxp matrix of gene expression 
data, whose columns have been re-arranged according to the 
causal ordering o, and denote by Xj the submatrix obtained by 
columns of X indexed by set /. Then as shown in [13], the 
estimate of the adjacency matrix of DAGs under the general 
weighted lasso (or (.\) penalty, is found by solving the following i\- 
regularized least squares problems for i = l,...,p 



A,,i_,j = argm\n{n-'\\Xi-e''Xu-x\\l + My^\ej\wji} (2) 

where denotes the first i— 1 elements of the /th column of 

the adjacency matrix and 2,- is the tuning parameter for each lasso 
regression problem. In Section Choice of Parameters and 
Properties of the Algorithm, we discuss the choice of this tuning 
parameter. Finally, Wji represents the weights of the lasso method; 
for the regular lasso penalty used here, vty, = 1 . In the RIPE 
algorithm, a natural extension of [13] is to use the above penalized 
likelihood estimation framework in order to estimate a DAG for 
each ordering oeO, where O is the set of orderings found from the 
perturbation data. However, the perturbation data provide 
additional information regarding the influence of the genes in 
the network. In particular, in the absence of noise, the set of 
parents of each gene in the regulatory network are a subset of the 
set of parents in the influence graph. Using this observation, we 
generalize the penalized estimation problem in (2) to limit the set 
of variables in each penalized regression to those of the parents of 
node i in the influence graph, consistent with each ordering, which 
equates the set of all ancestors of i in the regulatory network. In 
other words, for each ordering o, the set of edges pointing to each 
gene in the regulatory graph is estimated by solving the following 
£i -regularized regression (lasso) problem: 



argmin {n 



l + MO\\i} 



(3) 



where /„ = pa, " denotes the set of parents of i in the influence 
graph consistent with ordering o and \\0\\i is the £i norm of 0. 

To illustrate the optimization problem for estimation of DAGs 
in the second step of RIPE, consider the regulatory network in the 
left panel of Figure ID. The middle panel of the figure represents 
the ideal influence matrix, obtained when no errors are present in 
the perturbation data, and the right panel represents a realization 
of the influence graph with both false positive and false negative 
edges. In the first step of RIPE, causal orderings are determined 
based on the graph in the right panel of Figure ID. An example of 
such an ordering is 0 = (g2-^g\'^g3,-Kg4-Kg5)- Note that in this 
case many orderings exist, due to the presence of cycles in the 
influence graph. In the second step of the RIPE algorithm, a 
penalized regression problem is solved for each node, where the set 
of predictors are the set of parents of the node in the right panel of 
Figure ID, consistent with the given ordering. In particular, the 
folknving regression problems are solved for o (here Xj~ Xj + Xi^ 
denotes regression of Xj on Xi and X/c, ignoring for ease of 
presentation the corresponding penalty term): 

Xi X2, Xt, ~ X\ -\- X2, X4~ X\-\- X2, X^~ X2 + Xt,-\- X4 

Using the results of these regressions, the value of the penalized 
negative log-likelihood function is then determined for each of the 

estimated graphs. Based on these values, the "best" networks are 
then used to construct the consensus graph, as described next in 
the section discussing the third step of the RIPE algorithm. 

Given the set of orderings O of cardinality M=\0\, one needs 
to solve M separate penalized regression problems, and store their 
corresponding penalized negative log-likelihood values, where the 
computational cost of each of these problems is 0(np^). However, 
this step is fully parallelizable and using v processors the 
complexity reduces to 0{^niP'). 

Graph averaging: a consensus regulatory network 

As mentioned above, the estimated influence matrix V from 
perturbation screens results in multiple orderings o, either due to 
feedback regulatory mechanisms or noisy measurements. In such 
cases, the estimate of the adjacency matrix of the graph can be 
obtained from those corresponding to the set of orderings 
achieving the smallest penalized negative log-likelihood values. 
Therefore, the final step of the RIPE algorithm includes a model 
averaging procedure that combines the estimated DAGs from 
multiple orderings to construct a cyclic consensus network. This is 
illustrated with a small cyclic subnetwork in Figure SI. In this 
example, the true network includes a number of cycles, and the 
estimate from the RIPE algorithm correctiy identifies some of 
these cycles (also see Text SI). 

Let Lq denote the lower qih quantile of the penalized negative 
log-likelihood values, denoted by t, and Q = {oeO : ((o)<Lq} be 
the set of orderings with the lowest 100g% penalized negative log- 
likelihood values. The RIPE estimate of the adjacency matrix is 
then defined as the consensus DAG: 



1 



I keQ ''I 



(4) 



Here, is the DAG estimate from the fc-th ordering. A' denotes 
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the confidence of each edge in the final estimate, and \ci is the 
indicator fiinction which equals 1 when the condition Q, holds, and 
0 otherwise. 

Consequently, the estimate of the edge set of the graph is 
defined as E = {{ij) : A'ij>x}, where % is the threshold for 
including an edge. The value of t determines the desired level of 
confidence, and can be chosen by the user depending on the 
application of interest. Nevertheless, the above formulation 
provides a flexible estimation framework, in which T is considered 
a tuning parameter or can be set based on prior analyses (see the 
section Choice of Parameters and Properties of the Algorithm 
below and the Results section for more details about the choice of 
X). ^ 

The RIPE algorithm also produces an estimate of the sign of 
each edge, as well as the magnitude of the effect, defined by the 
matrices and below: 

teg 1^1 tee 

where the sgn(0) = 0. 

Choice of Parameters and Properties of tfie Algoritlim 

Similarly to any other learning algorithm, the performance of 
the RIPE algorithm depends on the choices of tuning parameters. 
There are three tuning parameter that need to be determined: (i) 
the penalty coefficient I, (ii) the likelihood quantile q, and (iii) the 
confidence threshold x. Next, we discuss strategies for choosing 
these parameters. 

In penalized regression settings, A determines the weight of the 
penalty term in the optimization problem, with larger values of X 
resulting in more sparse estimates. In [13], the following error- 
based choice of X was proposed for the i-th regression in (2), 

/l,:(a) = 2»-'/^ Z* „ . (6) 

ip(i-\) 

Here, is the x-quantile of the standard normal distribution and 
it can be shown that this choice of penalty controls the probability 
of falsely joining two unconnected ancestral components at level a. 
Interestingly, numerical studies in [13] show that the result of the 
analysis is not sensitive to the choice of a, and values of ae(0.1,0.5) 
result in comparable estimates. 

Even though the choice of /. in equation (6) controls the 
probabihty of false positives, it results in over-sparse estimates. 
Numerical studies in [24] strongly suggest the smaller value 
0.6A,(a), which is used in the numerical studies in this paper. 

Unlike the choice of A, our numerical studies indicate that the 
RIPE algorithm is not sensitive to the choices of q and T, and a 
wide range of values can be used for these parameters (see Results 
and Figures S2, S3, S4). It is worth pointing out that since the 
value otq determines the proportion of highest likelihoods used in 
constructing the consensus graph, when the perturbation data 
gives a reliable estimate of the influence matrix V, multiple 
orderings from the first step of RIPE would result in true feedback 
cycles in regulatory networks, and therefore, the differences in 
corresponding likelihood values are mostly due to the inherent 
noise in expression data. As a result, in the ideal case with no 
errors in the influence graph, a value of ^ = 1 would produce the 
"best" estimate. However, in practice, to avoid inferior estimates, 
outlier values of the likelihood should not be incorporated in the 
estimate. Our numerical analyses show that values of ^6(0. 1,0.9) 
would result in comparable estimates. 



Finally, the choice of t determines the confidence of edges in the 
estimated consensus network: large values of t result in edges that 
are more consistently present in all estimated graphs, while small 
values allow for less frequently present edges to be included in the 
final estimate. As with k, our numerical studies indicate that large 
values of t result in over-sparse estimates, and we recommend 
values of T€(0.05,0.35) (see the Results section and Figures S2, S3, 
S4). All estimates in the paper were obtained by setting q = Q. \ and 
T = 0.25, to achieve a balance between the standard performance 
measures of Precision and Recall. 

To complete our discussion of the RIPE algorithm, it is worth 
noting that an accurate influence matrix V including binary 
information from single gene knockouts contains sufficient 
information for obtaining the causal orderings of the underlying 
regulator)' network, but not its structure (see Lemma 1 in the 
Supporting Information (Text SI) for a formal statement and 
proof of this result). In the RIPE algorithm, abundantiy available 
steady-state expression data are used to compensate for this lack of 
information. Further, we estabUsh the asymptotic consistency of 
the network structure estimated using the penalized likelihood 
method (see Lemma 2 in the Supporting Information, Text SI). 

Results 

Preliminaries 

Determining the Influence Matrix. In Section Estimating 
the Influence Matrix from Perturbation Data under Methods we 

describe how to estimate an influence matrix V from perturbation 
experiments using differential expression analysis. The analysis 
produces p-values for each entry in the matrix, and by choosing a 
specific cutoff />*, we get an estimate of note that a lower p* 
gives rise to a sparser matrix. To select a reasonable cutoff, the 
number of edges in the influence graph for different p-values are 
plotted together with the size of the largest connected component, 
see Figure 3. If the network is modularized, i.e. to some extent 
consists of nodes grouped together as illustrated in Figure IB, we 
would consider a drop in the size of the largest connected 
component indicative of a good choice of p-value. If only a few 
edges make the difference between a large and a small connected 
comp()n('iit, \vc' haw most likc'ly found the y;-\-aluc for which the 
"noise edges" have been removed to reveal the modularized 
structure. If such a drop in the size of the large component is 
missing, possible reasons can be that the underlying network does 
not exhibit a modularized structure, or that the signal strength of 
the experimental data is rather low to clearly reveal it. However, a 
plot of the t)'pe given in Figure 3 can still be useful in such a 
setting, since we can choose a p-value based on the size of the 
connected component. An overly large connected component is 
not very realistic from a biological perspective, in addition to being 
computational demanding since it produces many potential 
orderings. 

Performance Evaluation Criteria. Sc\ <"ral evaluation cri- 
teria have been proposed for assessing the performance of dififerent 
network reconstruction methods. However, the choice of criteria 
depends on the information available on the "true" regulatory 
network: in synthetic, or in sUico examples, the gold standard is 
known, and Precision, Recall and F\ measures are often used as 
standard metrics (with F\ being the harmonic mean of the other 
two metrics). On the other hand, in real data applications, the gold 
standard is often unknown and the appropriate performance 
criteria should be determined based on the information at hand. In 
the section below, tided Regulatory Network in Yeast, the known 
protein-DNA interactions for yeast present in the BIOGRID 
database (release 3.1.74) [30] are used as gold standards. To 
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Figure 3. Influence graph characteristics versus p-value for DREAM4. Number of edges (open triangles) in the influence graph and the size 
of the largest connected component (dots) versus cut-off /i-value for differential expression. The data is based on five replications of the knockout 
and wild-type experiments for (A) 100-node network 1, (B) 100-node network 3, and (C) 100-node network 5 in the DREAM4 challenge. 
doi:1 0.1 371 /journal.pone.0082393.g003 



account for the fact that the BIOGRID database may not include 
all regulatory interactions, we carried out an experiment similar to 
the approach implemented in [15]. More specifically, to assess the 
significance of the number of true positives for each of the 
estimates, 10,000 Erdos-Renyi random graphs with the same 
number of edges as each of the estimates were generated (using the 
R-package igraph [31]), and the number of true positives in 
randomly generated graphs was used to approximate the /7-value 
for significance of the number of true positives observed. The 
resulting y;-\ alue determines the likelihood of observing a given 
number of true positives in randomly generated graphs, and can 
be used direcdy for performance evaluation. 

DREAM4 Challenge 

To illustrate the performance of the RIPE algorithm, we first 
evaluate it by using networks from the DREAM4 in-sUico network 
challenge, which is part of a series of challenges [32], aiming at 
inferring gene regulatory networks from simulated data. The gold 
standard networks in DREAM4 were generated by extracting 
modules from two source networks: yeast and Escherichia coli. The 
modules were extracted so as to preserve the structural properties 
of the underlying network, such as degree distribution and network 
motifs (statistically overrepresented circuit elements) [33]. The 
underlying dynamics in the models include mRNA and protein 
dynamics simulated by stochastic differential equations with both 
experimental and internal noise added (an extension to the 
method in [34]). 

The simulated data sets available within DREAM4 consist of 
observations from the unperturbed network (wild-type), perturba- 
tion experiments in which all genes are knocked-out one-by-one 
(knockouts), perturbation experiments in which the activity of aJl 
genes are lowered one by one with a factor of two (knockdowns), as 
well as small perturbation of all genes simultaneously (multifactorial) 
and finally, time series. 

Three of the five 100-node networks were selected for analysis; 
network 1 (NetT), network 3 (Net3) and network 5 (Net5). These 
particular choices were made based on the differences in 
topologies, as well as on how well the network structures were 
predicted in terms of AUG (Area Under the ROC Curve) in the 



in-sUico challenge. Specifically, several competing research teams 
submitted predictions of the network structures for the DREAM4 
challenge. Netl was best predicted overall, while the structure of 
Net5 was the most difficult to deduce. We chose to also analyze 
Nets since Networks 2-4 were predicted with comparable 
accuracy, with intermediate AUC values compared to Netl and 
Nets. The varying prediction accuracies can be explained by the 
differences exhibited by the various network structures; for 
example, Netl has a layered structure, with many nodes 
functioning as either parents or children in the graph, while Net5 
has a more complex topology with several short cycles and nodes 
with multiple parents. 

The DREAM4 challenge only includes one replicate of each 
simulated experiment, and in order to assess the noise levels in the 
data, we simulated fi\e replicates of each of the wild-type, 
knockdown, and knockout experiments, as well as one multifac- 
torial data set for the networks of interest by using GeneNetWea- 
ver 3.0. The DREAM4 default settings were used, excluding 
standardization of the simulated data. 

Two methods tied for first place in predicting the topology for 
the 100-node networks in DREAJVI4; a ;-test based method [10] 
and a method based on confidence matrices, pruned by down- 
ranking of feed-forward loops (FFLDR) [9] . As the two methods 
performed comparably, we chose to compare our method to 
FFLDR, for which the authors kindly provided their code. In [10], 
the ?-test method is also combined with time-series data using 
ordinary differential erjuations to model the temporal changes in 
the gene expression (known as Inferelator). However, as this 
method was not submitted in the challenge of network topology 
prediction, and in addition utilizes time series data, we have not 
included it in the comparisons. 

The influence matrix V was estimated by comparing the 
expression levels from perturbation experiments (five replicates for 
each knockout/knockdown) to the corresponding levels in the 
unperturbed network (five replicates of the wild-type data), as 
described in the Preliminaries section above (see Figure 3). The p- 
value 0.019 was selected for both Netl and Net5, while 0.003 was 
chosen for NetS. For the knockdown data, the same method was 
used to select the cutoff p-values as for the knockout data (see 
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Figure S5). For FFLDR, the optimal performance for large-scale 
networks is obtained by directly comparing the expression levels in 
knockout experiments, and hence, the five replicates of the 
knockout data (not using the wild-type data for comparison) were 
averaged, and this signal was used as input to the method. 

Table 1 summarizes the performances of the PC-algorithm 
(PCALG, as implemented in the R-package pcalg), the Nested 
Effects Model (NEM, as implemented in the R-package nem), as 
well as the FFLDR and RIPE (based on 10,000 MC-DFS 
orderings) algorithms in reconstructing Metl, Met3, and Net5. Since 
the DREAM4 challenge does not include suitable steady-state 
expression data, to employ PCALG and RIPE, «= 100 indepen- 
dent samples from a Gaussian distribution with mean 0 and 
variance 1 were generated from each of the networks. The 
strengths of associations among genes were chosen as uniform 
random numbers in the range +(0.2,0.8) and to handle the effect 
of cycles in the regulatory network, the magnitudes of strengths 
were then normahzed according to the method described in [35]. 

As a benchmark, we also evaluated the performance of the 
ARACNE procedure [36] on the simulated data. It is important to 
note that ARACNE estimates undirected networks. Thus, the 
estimates from ARACNE cannot be directly compared with the 
other methods mentioned above. Nonetheless, we applied 
ARACNE with a number of />-value cutoffs and found that the 
Bonferroni adjusted /)-value cutoff of 1.010 x 10^^ offers the best 
performance compared to the true network. 

It can be seen from Table 1 that RIPE outperforms the 
competing methods (see F\ measure), with the exception of 
knockdown data for Metl where FFLDR exhibits a slight edge. The 
differences in performance are more pronounced for the more 
complex Nets. As mentioned above, Met3 and Met5 pose more 
challenging reconstruction problems compared to Metl\ hence, all 
methods exhibit inferior performances in the reconstruction task. 
On the other hand, while knockout data represent ideal 
perturbation experiments, knockdown data correspond to less 
accurate ones. Therefore, the performances of NEM and FFLDR, 
that only employ perturbation screens, as well as RIPE, deteriorate 
in the case of knockdown data, whereas the performance of 
PCALG and ARACNE are not affected by the change in the 
perturbation data, as it uses as input only steady state expression 
data. 

To further evaluate the performance of ARACNE, PCALG and 
RIPE, we also applied these methods to the multifactorial data. As 
described earlier, the multifactorial data set is obtained from non- 
i.i.d observations, which violate the underlying assumption of both 
PCALG and RIPE algorithms. In addition, this data set does not 
correspond to a steady-state setting. Interestingly, the results in 
Table 2 indicate that the performance of all three competitors 



deteriorates by roughly similar factors, which can be attributed to 
the lower quality of this multifactorial data set. On the other hand, 
the results show that even when the steady-state data violate the 
underlying assumptions of the model, the performance of the 
RIPE algorithm is comparable to that of FFLDR, particularly for 
the more complex structure oi Met5. 

These results strongly indicate that combining perturbation 
screens with steady state expression data are beneficial to tiie 
regulatory network reconstruction problem, especially in settings 
where the perturbation data are rather noisy and the steady state 
data exhibit good quality. To better address this issue and obtain a 
deeper insight into the effect of noise on the perturbation data we 
vmdertake a number of experiments based on synthetic data. 

Experiments with Synthetic Data 

To assess the influence of the inputs and steps required by the 
various algorithms, we examine a number of settings both in small 
and large scale networks. 

Small Directed Acyclic Graph. We start our discussion on 
synthetic data witii the toy example illustrated in Figure S6. 
Specifically, we employ a randomly generated DAG of size p = 20 
corresponding to the true regulatory network. To emulate possible 
regulatory mechanisms, the generated DAG includes a number of 
"hub" genes, as well as two genes that are not regulated by any 
other gene. 

To obtain independent expression data {X) consistent with the 
underlying DAG, an association weight of p = 0.8 is assigned to all 
the edges in the graph and the available functions in the R- 
package pcalg are used to generate « = 50 independent samples of 
Gaussian random variables with mean 0 and variance 1 , according 
to structural equation models that incorporate the influence of 
nodes of the graph on each other (see [37] for more details). 

The influence matrix V corresponding to the perturbation data 
is generated as follows: given the adjacency matrix of a DAG A, it 
is shown in [38] that V = [{I — A) where / denotes the 
identity matrix and [-1 the ceiling operator. We denote by Vq the 
ground truth influence matrix corresponding to the generated 
DAG. However, as previously discussed, in practice the influence 
matrix is extracted from gene expression data obtained from the 
perturbation experiments and hence is inherently noisy. Depicted 
in Figure S7 is the matrix Vo (left-most image) and three variants 
of that matrix that we examine; in the first one (Pi) the direction of 
a small proportion of edges reversed (second image from the left) 
and in the second one (P2) edges are added (second image from 
the right); the third matrix (7^3) includes both reversed edges, as 
well as an addition of extra edges (right-most image). 



Table 1. Reconstruction results for DREAM4 networks with simulated steady state data. 





Method 


Net1, KO 




Net}, KD 




Net3, KO 




Nets, KD 




Nets, KO 




Nets. KD 




P 


R 


F, 


P 


R 


F, 


P 


R 


F, 


P 


R 


F, 


P 


R 


Fi 


P 


R 


F, 


ARACNE 


17 


41 


24 


17 


41 


24 


10 


48 


16 


10 


48 


16 


16 


51 


25 


16 


51 


25 


PCALG 


48 


26 


34 


40 


23 


29 


48 


25 


33 


44 


25 


31 


40 


20 


27 


43 


22 


29 


NEM 


28 


34 


31 


6 


6 


6 


14 


9 


11 


4 


4 


4 


9 


17 


12 


7 


10 


08 


FFLDR 


92 


57 


70 


86 


46 


60 


66 


44 


53 


60 


28 


38 


44 


35 


39 


49 


26 


34 


RIPE 


80 


71 


75 


74 


46 


56 


65 


49 


56 


61 


30 


40 


59 


47 


52 


56 


34 


42 



Performance measures, in percentages, for methods of reconstruction of DREAM4 Net!, Net3, and Wef5, using both knocl<out (KO) and knockdown (KD) data. Steady 
state expression data is generated from structural equation models based on the true graph. 
doi:l 0.1 371/journal.pone.0082393.t001 
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Table 2. Reconstruction results for DREAM4 networks with multifactorial data. 





Method 


Netl, KO 




Netl, KD 




/Vef5, KO 




Nets, KD 




Nets, KO 




Nets, KD 




P 


R 


F, 


P 


R 


F, 


P 


R 


F, 


P 


R 


F, 


P 


R 


F, 


P 


R 


F, 


ARACNE 


25 


9 


13 


25 


9 


13 


28 


19 


23 


28 


19 


23 


16 


14 


15 


16 


14 


15 


PCALG 


24 


09 


13 


24 


09 


13 


39 


15 


22 


39 


15 


22 


21 


08 


12 


21 


08 


12 


NEM 


28 


34 


31 


6 


6 


6 


14 


9 


11 


4 


4 


4 


9 


17 


12 


7 


10 


8 


FFLDR 


92 


57 


70 


86 


46 


60 


66 


44 


53 


60 


28 


38 


44 


35 


39 


49 


26 


34 


RIPE 


74 


38 


50 


80 


30 


43 


71 


31 


43 


68 


23 


34 


54 


32 


40 


52 


24 


33 



Performance measures, in percentages, for methods of reconstruction of DREAM4 Netl, Nets, and Wef5, using both l<nockout (KO) and l<nockdown (KD) data. 
Multifactorial data from DREAM4 challenge is used as steady-state expression data. 
doi:l 0.1 371/journal.pone.0082393.t002 



In this case, the small size of the network together with the 
relatively small amount of noise introduced in the influence matrix 
allows us to obtain all possible orderings; specifically, Vi which 
contains the most number of orderings, has 1 2 strongly coimected 
components of small size with a total of 3926 possible orderings, 
and hence can be easily handled with exhaustive search. On the 
other hand, the total number of orderings for Vq, Vi and V2 are 1, 
1, and 2 respectively. However, we emphasize that obtaining the 
causal orderings of large strong components (i.e., larger than 15 
nodes) using the exhaustive method is computationally expensive, 
and the MC-DFS heuristic becomes the only practical choice. 

The results of applying the four methods under consideration to 
each of the input data sets are given in Table 3. Note that NEM 
and FFLDR use only the four influence matrices Vo — "Pj as input, 
while the PCALG only uses the steady state gene expression data 
X; finally, RIPE uses both of them. The numbers in parentheses 
correspond to the standard deviation of the metrics for PCALG 
and RIPE obtained from 100 replications of the steady state gene 
expression data X. The numbers reported for RIPE, are obtained 
by considering all possible orderings generated using the 
exhaustive DFS algorithm for Vo — Vi , and for the consensus 
graph was obtained by setting q = Q.\ and T = 0.25. A graphical 
summary of F\ measures over different versions of the influence 
matrbc is given in Figure 4A. It is worth noting that as discussed 
above, the choices oiq and confidence threshold t do not critically 
affect the performance of RIPE and values of (?e(0.1,0.9) and 
Te(0.05,0.6) yield comparable results (see Figures S2, S3, S4). 

It can be seen that by combining perturbation screens and 
steady state gene expression data, RIPE clearly outperforms the FFL- 
DR, NEM and PCALG methods that use a single source of data. Note 
that PCALG only uses the X data, thus the identical entries in the 
table. It is worth noting that although the performances of NEM, 
FFLDR and RIPE are affected by the increasing level of noise in 
the perturbation data (as expected), RIPE can compensate for this 
loss of accuracy by incorporating the additional information from 
the steady-state data. Obviously, all these effects would be 
magnified if a large number of spurious edges are added or a 
large number of true edges are deleted, thus introducing a 
significant amount of noise in the influence matrix V. Interest- 
ingly, our results indicate that PCALG has a slight edge over NEM 
(both in case of DREAM data and the synthetic network). This 
may be due to the specific structure of the DAG under 
consideration (experiments show that NEM, which uses triplets 
of nodes to determine the order of the edges, often performs better 
in chain-type graphs, for example in DREAM4 Metl). The inferior 
performance of NEM in this setting can also be attributed to the 
fact that NEM, as originally proposed, performs well settings 
where the number of perturbation experiments is significantly 



smaller than the number of affected genes, which is not the case in 
our numerical experiments. On the other hand, RIPE takes a 
global view by constructing causal orderings, in addition to 
independently evaluating them with steady state gene expression 
data. Of particular interest is the significant deterioration in the 
performance of FFLDR in the cases of V2 and Vt, that indicates a 
vulnerability of the method in the presence of noise in its input 
data. 

Finally, to assess the effect of approximation used in MC-DFS, 
in comparison to having the universe of orderings, we estimated 
the regulatory network from with different number of 
orderings. The average values of the Precision, Recall and F\ 
measures over 1 00 replications for different number of orderings 
considered are given in Table 4. All estimates were obtained by 
using q = Q.\ and T = 0.25 for estimating the consensus graph. It 
can be seen that, in comparison to the estimate obtained by 
evaluating all 3926 ordering using exhaustive DFS, the one 
obtained by considering a random subset of a small number of 
orderings provides comparable results. In particular, aside from a 
slight increase in recall, as the number of orderings increases, the 
performance of the method with only 200 orderings obtained 
using MC-DFS is comparable to utilizing the universe of orderings 
generated from an exhaustive search with DFS. This result 
suggests that the MC-DFS algorithm represents a viable alterna- 
tive for settings involving a large size strongly connected 
component, and as a result, the RIPE algorithm is not highly 
sensitive to the number of orderings used to reconstruct the 
network. 

Effect of False Positive and Negative Errors in 
Perturbation Data. To analyze the effect of false positive and 
negative errors in the perturbation data on the performance of the 
RIPE algorithm, a randomly generated DAG of size />= 100 was 
used as the true regulatory network. The influence matrix V 
corresponding to the true perturbation data, and independent 
expression data of size n=\ 00 consistent with the underlying 
DAG were generated according to the methods explained in the 
previous section. 

To emulate the effect of false positive and negative errors, three 
different scenarios were considered: false positive errors [FP), in 
the form of edges not present in the true influence graph, false 
negative edges [FN), in the form of missing edges compared to the 
true graph, and both false positive and false negative edges 
[FP -\- FN). Additionally, to assess the effect of increasing noise 
levels, 3 levels of noise in each of the above scenarios were 
considered. To determine the appropriate noise levels, the number 
of edges in Vq (198) were used as calibration, and the proportion of 
false positive and false negative edges were adjusted so that each 
randomly perturbed influence matrix included the same number 
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Table 3. Reconstruction results for synthetic networks subject to error in the perturbation data. 





Method 


Po 






Pi 






P2 






P3 






P 


R 


Pi 


P 


R 




P 


R 


fi 


P 


R 


F^ 


PCALG 


54(8) 


52(10) 


53(8) 


54(8) 


54(8) 


52(10) 


53(8) 


52(10) 


53(8) 


54(8) 


52(10) 


53(8) 


NEM 


50 


33 


40 


55 


33 


39 


36 


33 


41 


21 


28 


24 


FFLDR 


100 


67 


79 


40 


92 


67 


77 


33 


36 


33 


28 


30 


RIPE 


95(4) 


99(2) 


97(2) 


79(4) 


90(4) 


94(1) 


92(3) 


100(1) 


88(3) 


61(5) 


84(6) 


71(5) 



Average performances measures, in percentages, for methods of network reconstruction in the synthetic networks. Numbers in parentheses show standard deviations 
for methods based on simulated steady state data (PCALG and RIPE). indicates the ideal influence graph and Vi to 7^3 represent noisy versions of the influence graph 
(see Figure S7). 
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of expected false edges. The results were 9 noisy influence graphs 
under each of FP, FN and FP + FN settings with roughly 25, 50 
and 75 false edges. 

Considering the inefficiency of NEM for network estimation in 
high dimensional settings especially when the number of effect 
genes is small, we focus on the performance of FFLDR and RIPE, 
using PCALG as a benchmark. The results for RIPE were 
obtained based on 1000 randomly generated orderings for the 
cases where the influence graphs were generated with false 
positives (since the true graph is a DAG, the case of FN amounts 
to a single ordering). 

The performance of the above methods in each of the input 
data sets are given in Table 5, where the results for RIPE and 
PCALG correspond to averages over 50 independent draws of the 
steady state data. Figure 4B summarizes the values of Fi measures 
for diflerent methods. These results reveal a number of interesting 
aspects of FFLDR and RIPE algorithms. First, as expected, 
increasing levels of false negatives impact the performance of 
RIPE, while FFLDR could be severely impacted by high false 
positive rates. Secondly, while the worst-case performance of RIPE 
(for the case of FN) matches that of FFLDR, the proposed data 
integration framework can result in significant gain in network 
estimation accuracy in other scenarios. While high quality 
perturbation screens greatly improve the performance of both, 
by combining steady state data and perturbation screens, the 
RIPE algorithm could compensate for the inaccuracy of the 
perturbation data, whereas estimation based on perturbation data 
alone can result in FFLDR estimates that are inferior to those of 
PCALG. Finally, Figure S8 shows the improvement with 
increasing number of orderings in Fi, P, and R for inference 
using the influence graph with highest level of false positives 
(1 .5%). It can be seen that although the three measures moderately 
improve with higher number of orderings, a small number of MC- 
DFS orderings are sufficient for acceptable performance of the 
RIPE algorithm. Note that due to the acyclicity of the underlying 
graph, different levels of false negative errors correspond to a 
single ordering for the influence graph, and hence a similar 
comparison for the case of false negatives is not relevant. 

Large Cyclic Graph {p = 1 000) . Our final numerical exper- 
iment with synthetic data compares the performance of RIPE with 
those of PCALG and FFLDR in reconstructing large cyclic graphs 
in the presence of both false positive and negative noise in the 
perturbation data. The setting of this simulation is similar to that 
of the previous section, with the main difference being the size of 
the graph and presence of cycles (feedback loops) in the true graph. 
In particular, a random cyclic graph with /> = 1 000 nodes and 
~2/)=1984 edges was generated, and n = 500 samples were 



generated from zero mean, unit variance Gaussian random 
variables, as steady state expression levels from the true network. 

As before, three different scenarios were considered: false 
positive errors {FP), false negative edges {FN), and both false 
positive and false negative edges {FP + FN) at 3 noise levels in 
each of the above scenarios. The noise levels were set up so that 
approximately 200, 400, and 600 erroneous edges were included 
in each of FP, FN, and FP + FN settings. 

The performance of the above methods for each of the input 
settings is given in Table 6, where the results for RIPE (with 1000 
random orderings) and PCALG are averages over 5 independent 
drawings of the steady state data. Figure 4C summarizes the values 
of Fi measures for different methods. These results confirm the 
findings of the previous section. In particular, RIPE outperforms 
the other two algorithms in all the simulated settings, and the 
difference between the performances of RIPE and FFLDR is 
magnified as more false positive edges are added to the 
perturbation graph. Finally, as expected, the PCALG does not 
compare favorably with the other two methods in the setting of 
cyclic graphs. 

Regulatory Network in Yeast 

To evaluate our method on real data, we explore the 
transcription factor (TF) regulatory network in Saccharomyces 
cerevisiae. The influence matrix V was estimated from a large 
transcription factor knockout experiment [39], that was subse- 
quently reanalyzed by [40] . Briefly, the experiment consists of 269 
knockouts or knockdowns of yeast transcription factors, investi- 
gated by hybridization to microarrays under normal conditions. In 
total, 588 custom-made two-color microarrays with wild-type 
standard total RNA used as reference strain were employed in four 
batches and three different strains. Reanalysis of the data involved 
within-array normalization, background correction, between 
array-normalization and corrections for batch effects and strain 
effects. The genes were ranked using a moderated Z-statistic, and 
FDR-correction was applied based on the method of [41]. 

We extracted the expression values for the genes corresponding 
to the 269 TF knockouts (resulting in a square matrix of expression 
values) and a p-value cut-off was chosen to 0.002 based on the 
same type of plot as for die DREAM4 networks (see Figure S9). 
The resulting influence matrix corresponds to a graph with a 
strongly connected component of size 113. 

The steady state experimental data employed come from a 
publicly available data set (Array Express E-TABM-773) and has 
been used (e.g., see [42]) to assess the day-to-day variation in large 
yeast array experiments. The data contains 200 samples, with 
samples from each day hybridized against a pool of wild-type 
strains. Although the samples are not necessarily independent, or 
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Figure 4. Performance of RIPE and competing metKiods on the reconstruction of synthetic networl<s. (A) Average F\ measures for 
reconstruction using NEM, PCALG, FFLDR and RIPE on a network of size /j = 20. V(, corresponds to the Ideal Influence graph and V\ to represent 
Increasing levels of noise In perturbation data (see also Figure S7). (B)-(C) Average F\ measures for reconstruction of synthetic regulatory networks 
using PCALG, FFLDR and RIPE for different levels of false positive and negative noise In perturbation data. Numbers In parentheses Indicate the 
expected number of false edges In each case. The true graph Is an acyclic graph (DAG) of size /; = 100 In (B) and a cyclic graph of size p= 1000 In (C). 
dol:10.1371/journal.pone.0082393.g004 

identically distributed (due to batch effects, temporal correlations 
etc), here we use this dataset as an approximation for i.i.d 
measurements. For the strong components whose size prohibits an 
exhaustive search, we apply the MC-DFS heuristic with 
m= 10,000 random permutations. 

The performance of the competing methods is based on the 
strategy outlined in the Preliminaries section above. Figure 5 
shows the histogram of the number of true positives for randomly 
generated graphs of the same size as each of the reconstructed 
networks, as well as those obtained from three dififerent methods 
Note that the NEM algorithm failed to produce an estimate of the 
network in this case after 1 0 days of run time and thus its results 
were not included. Interestingly, the RIPE estimate is the only 
method for which the number of true positives are significantly 
larger (at 5% significance level) than that obtained in a random 



Table 4. Innpact of 
the RIPE algorithnn. 


increas 


ng number of orderings used in 




Number of Orderlngs 


p 


R 


fi 


100 


60 


81 


69 


200 


61 


82 


70 


1000 


61 


82 


70 


2000 


60 


83 


70 


3926 (all) 


61 


84 


71 



Average performance measures. In percentages, for RIPE in the synthetic 
network V-^. 
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Table 5. Impact of increasing false positive and negative errors In perturbation data on estimation of acyclic graphs. 















P 






R 










PCALG 


FFLDR 


RIPE 


PCALG 


FFLDR 


RIPE 


PCALG 


FFLDR 


RIPE 




NO ERR 


70(3) 


99 


97(1) 


68(3) 


100 


97(2) 


73(3) 


98 


97(1) 


FP 


0.50% 


70(3) 


73 


96(1) 


68(3) 


65 


95(2) 


73(3) 


84 


97(1) 




1% 


70(3) 


48 


94(1) 


68(3) 


40 


91(2) 


73(3) 


62 


96(1) 




1 .50% 


70(3) 


45 


83(1) 


68(3) 


30 


77(2) 


73(3) 


97 


91(1) 


FN 


10% 


70(3) 


90 


90(1) 


68(3) 


93 


92(2) 


73(3) 


88 


87(1) 




20% 


70(3) 


85 


85(1) 


68(3) 


86 


88(1) 


73(3) 


85 


83(1) 




30% 


70(3) 


73 


74(1) 


68(3) 


76 


80(2) 


73(3) 


71 


69(1) 


FP + FN 


0.25%, 5% 


70(3) 


74 


92(1) 


68(3) 


65 


92(2) 


73(3) 


86 


93(2) 




0.5%, 10% 


70(3) 


58 


88(1) 


68(3) 


52 


88(2) 


73(3) 


66 


87(1) 




0.75%, 15% 


70(3) 


56 


88(1) 


68(3) 


48 


89(2) 


73(3) 


68 


87(1) 


Average performances measures, in percentages for PCALG, FFLDR and RIPE in the synthetic DAG of size p = 1 00 with different error structures. Numbers 
indicate the standard deviation of the estimates over 50 draw/s of simulated data (only for PCALG and RIPE). 


n parentheses 
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graph (0.0185 for RIPE compared to 0.1265 for FFLDR and 
0.5543 for PCALG). These findings further highlight the 
advantage of the proposed approach over existing methods. 

A comparison of the estimated graphs using these three methods 
indicates a significant overlap between RIPE and FFLDR 
reconstructions: specifically, out of 520 edges detected in RIPE 
and 622 edges in FFLDR, 225 edges are in common. On the other 
hand, the PCALG reconstruction has considerable less overlap 
with the other two estimates: 8 and 12 common edges with RIPE 
and FFLDR, respectively. The significant overlap between the 
FFLDR and RIPE estimates suggests that some of the edges 
detected by both methods may indeed correspond to true 
regulatory interactions in yeast, which are not included in the 
BIOGRID database. Such results could be used as a starting point 
for designing the corresponding validation experiments. 

As with the DREAM4 data, we also apphed ARACNE for 
estimation of the yeast network, and found that the Bonferroni 
adjusted p-value cutoff of 1.38 x 10^^ results in the best estimate. 



with 131 true positives (compared to BIOGRID) and 5594 total 
edges. This indicates that the estimated network based ARACNE 
is significantly denser compared to all other estimators. Evaluation 
of the significance of the number of true positives using the 
random graph method described above gives a /i-value of 0.0286, 
which indicates that ARACNE has a significantly larger propor- 
tion of true positives compared to FFLDR and PCALG. This is 
somewhat in agreement with the analysis on the DREAM4 data, 
where the combined performance of ARACNE based on the i^i 
measure was aflFected by its low Recall rate compared to the other 
methods. 

As pointed out in the Methods section, the RIPE algorithm can 
also be applied in the case where k<p. To illustrate this, we next 
describe the application of the RIPE algorithm for estimation of 
the entire regulatory network of yeast with k = 269 TF's and 
/) = 6051 total genes. In such settings, one often obtains 
perturbation data on a subset of genes of interest (here the set of 
TFs) and is interested in obtaining an estimate of the regulatory 



Table 6. Impact of increasing false positive and negative errors in perturbation data on estimation of high dimensional cyclic 
graphs. 













P 






R 










PCALG 


FFLDR 


RIPE 


PCALG 


FFLDR 


RIPE 


PCALG 


FFLDR 


RIPE 




NO ERR 


27(3) 


58 


63(1) 


31(3) 


89 


83(1) 


25(3) 


43 


50(1) 


FP 


0.02% 


27(3) 


47 


63(1) 


31(3) 


63 


83(1) 


25(3) 


38 


51(1) 




0.04% 


27(3) 


39 


61(1) 


31(3) 


43 


80(1) 


25(3) 


35 


50(1) 




0.06% 


27(3) 


34 


59(1) 


31(3) 


33 


73(1) 


25(3) 


35 


50(1) 


FN 


2% 


27(3) 


57 


61(1) 


31(3) 


88 


83(1) 


25(3) 


43 


49(1) 




4% 


27(3) 


55 


59(1) 


31(3) 


82 


81(1) 


25(3) 


41 


47(1) 




6% 


27(3) 


54 


59(1) 


31(3) 


82 


81(1) 


25(3) 


40 


46(1) 


FP + FN 


0.01%, 1% 


27(3) 


54 


62(1) 


31(3) 


73 


83(1) 


25(3) 


43 


49(1) 




0.02%, 2% 


27(3) 


48 


61(1) 


31(3) 


64 


80(1) 


25(3) 


39 


50(1) 




0.03%, 3% 


27(3) 


42 


61(1) 


31(3) 


51 


79(1) 


25(3) 


35 


50(1) 


Average performances measures, in percentages, for PCALG, FFLDR and RIPE in the synthetic cyclic network of size p = 1 000 with different error structures. Numbers in 
parentheses indicate the standard deviation of the estimates over 5 draws of simulated data (only for PCALG and RIPE). 
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PCALG 

TP =10, |E|=476 (p-value=0.5543) 



FFLDR 

TP =18, |E|=622 {p-value=0.1265) 



RIPE 

TP =19, |E|=520 (p-value=0.0185) 



no of TP 



no of TP 



no of TP 



Figure 5. Performance evaluation for the reconstruction of the yeast regulatory network. Number of true positives for each method, in 
comparison to the BIOGRID database, as well as a histogram for the number of true positives in randomly generated networks of the same size are 
shown. The p-values are obtained based on 10,000 randomly generated networks. 
doi:1 0.1 371/journal.pone.0082393.g005 



interactions among the set of perturbed genes and all other genes iii the 
network. In this example, this amounts to a network with edges 
from each of the 269 TF's to each of the genes (TF's and TG's) in 
the network. Specifically, this corresponds to a 2-layer graph 
consisting of edges amongst TF's, as well as ed,ges between TF's 
and TG's. 

Considering the fact that perturbation data are only available 
for a subset of k perturbed genes, one has to impose a constraint on 
the orderings between perturbed genes and the rest of the genes in 
the network. A natural constraint is to assume that no edges exist 
from unperturbed genes to the perturbed ones. This assumption 
defines a clear choice for ordering of nodes in the graph: the set of 
perturbed genes (say 1 to k) appear before the unperturbed ones in 
any ordering of nodes. It is then clear that RIPE can be applied to 
estimate the regulatory edges in the network by obtaining 
estimates of the two-layer network using the penalized regression 
approach in (3). The computational efficiency of the RIPE 
algorithms facilitates its application to estimation of the entire 
network regulatory interactions based on limited perturbation 
data. Estimation of the regulatory network of yeast with p = 605 1 
genes and k = 269 transcription factors based on 1000 orderings 
takes less than 22 minutes on a 2.7 GHz Laptop with 6 GB of 
memory. The resulting estimate includes 134 interactions reported 
in the BIOGRID dataset (true positives) and a total of 10014 
edges. In an experiment similar to those reported above, the 
number of true positives in 1 000 random graphs with the same 
number of edges and similar 2-layer structure no network with 
equal or larger true positives was observed (p-value< 0.001). The 
distribution of number of true positives in comparison to the 
number of true positives for the RIPE estimator are shown in 
Figiare SIO. 

Discussion 

The proposed methodology offers several advantages over 
existing approaches in addressing the key problem of reconstruct- 
ing of regulatory networks. It relies on a global assessment of 
causal orderings and employs both perturbation screens and 
steady state expression data for the reconstruction step that boosts 
performance. Further, the penalized likelihood method used for 
estimating the edges exhibits a certain degree of robustness to 
misspecification of the causal orderings, as observed in [13]. As 
mentioned in the introductory section, highly accurate perturba- 



tion data may be sufficient for die reconstruction task at hand, but 
this is not often the case. On the other hand, integrating two data 
sources proves beneficial, as our numerical work illustrates. We 
discuss next several issues related to the RIPE algorithm and 
oudine some future research directions. 

As previously indicated, scalability issues are important to the 
proposed methodology, since the influence matrix V usually 
contains cycles due to natural feedback loops in gene mechanisms 
and the noisy measurements in the perturbation experiments. 
Hence, calculating all possible causal orderings compatible with V 
may become infeasible given the exponential complexity of the 
problem. The proposed MC-DFS heuristic offers a fast, reliable 
alternative. Our numerical experiments suggest that in practice it 
is not required to exhaustively search the space of possible 
orderings, and a moderate number of randomly generated 
orderings often produce comparable estimates. Based on the 
results reported here, for graphs of up to /i = 1 000 nodes, a total of 
~ 1000 orderings results in reliable estimates. 

Our extensive evaluation studies strongly suggest that the RIPE 
algorithm is especially suitable in settings where one deals with 
noisy perturbation data and fairly good quality steady state 
expression data are available. Algorithms utilizing only informa- 
tion from perturbation screens work well for topologies without 
many cycles, while those relying only on observational data are not 
particularly competitive. Further, other data sources that can 
further filter the influence matrix, such as binding experiments 
(e.g. based on ChlP-chip technology), would be beneficial. 

The proposed methodology is in principle apphcable to other 
organisms for which steady state gene expression data with large 
sample size exist, such as Arabidopsis, mouse and human. The 
bottleneck would be the paucity of systematic perturbation screens, 
since they are more costiy to produce. But the RIPE algorithm 
would be well suited for estimating a regulatory subnetwork for 
which adequate perturbation data are available. 

An interesting extension of the proposed methodology involves 
perturbation screens from time course data [8]. The RIPE 
algorithm can be extended as follows. Relying on the fact that the 
bperturbation screens convey the causal ordering of the genes in 
the network, the DAG scoring method itself can be extended to 
cover time course steady state expression data. For example, a 
modified version of the likeliliood scoring of DAGs can be used to 
shrink the estimated networks in each time point towards a 
common skeleton for the underlying network (similar to the 
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approach described in [43]), or each time point can be modeled as 
a modiGed version of the network at the previous time point [44] . 

Supporting Information 

Figure SI Small cyclic subnetwork example. The true 
network (left) includes a number of cycles, and the estimate from 
the RIPE algorithm correcdy identifies some of these cycles (right). 

(TIFF) 

Figure S2 Numerical study on choices of r and q. Values 
of the Precision (P) for different combinations of T (threshold for 
including an edge in the consensus graph) and q (proportion of 
highest values of the log-likehhood function used in constructing 
the consensus graph). 
(TIFF) 

Figure S3 Numerical study on choices of T and q. Values 
of the Recall {R) for different combinations of t (threshold for 
including an edge in the consensus graph) and q proportion of 
highest \ alues of the log-likehhood function used in constructing 
the consensus graph). 
(TIFF) 

Figure S4 Numerical study on choices of T and q. Values 
of the Fi measure for different combinations of T (threshold for 
including an edge in the consensus graph) and q (proportion of 
highest values of the log-likehhood function used in constructing 
the consensus graph). 
(TIFF) 

Figure S5 Influence graph characteristics versus p- 
value for knockdown experiments in DREAM4. Number 

of edges (open triangles) in the influence graph and the size of the 
largest connected component (dots) versus cut-off p-value for 
differential expression. The data is based on five replications of the 
knockdown and wildtype experiments for (A) 100-node network 1, 
(B) 100-node network 3, and (C) 100-node network 5 in the 
DREAM4 challenge. P-values chosen for analysis were 0.03 for (A) 
and (C), and 0.027 for (B). 
(TIFF) 

Figure S6 Synthetic regulatory network. 

(TIFF) 

Figure S7 Illustration of influence matrices for synthet- 
ic networks. Vq'- ground truth, Vi'. 5% of directions reversed, 
V2'- 10% new effects added, Vs'. 5% cUrections reversed and 10% 
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new effects added. A black dot in position (i.e., in row i and 

column]) represents that gene i inuences genej. 

(TIFF) 

Figure S8 Effect of increasing number of ordering used 

in RIPE. The values of i^i, P and R are displayed for increasing 
number of orders in the synthetic DAG with^= 100 nodes under 
1.5% false positive edges. 
(TIFF) 

Figure S9 Influence graph characteristics versus p- 
value for the yeast regulatory network. Number of edges 
(open triangles) in the influence graph and the size of the largest 
connected component (dots) versus cut-off p-value for differential 
expression in the transcription factor knockout experiments in 
yeast. The graph is based on expression data from 588 two-color 
microarrays for 269 transcription factors. The p-value cut-off was 
chosen to 0.002. 
(TIFF) 

Figure SIO Performance of RIPE in estimating the 
layered yeast regulatory network. The distribution of 

number of true positives, in comparison to the number of true 
positives for the RIPE estimator, for 1 000 random graphs with the 
same number of edges and similar 2-layer structure (p = 605 1 
genes and k=269 transcription factors). No random network with 
equal or larger true positives was observed (p-value<0:001). 
(TIFF) 

Figure Sll Simple graphs with 3 nodes. 

(TIFF) 
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